IN THE CLAIMS:

Please amend the claims as follows: 

1. (Original) A computer-implemented method for identifying correlated columns
from database tables, comprising: 

determining correlation attributes for a first column and a second column from
one or more database tables, the correlation attributes describing for each column at 
least one of the column and content of the column; 

comparing the correlation attributes from the first and second column; 

identifying similarities between the first and second column on the basis of the 
comparison; 

on the basis of the identified similarities, determining whether the first and second
column are correlated; and 

merging the first and second columns only if the columns are determined to be 
correlated. 

2. (Original) The method of claim 1, wherein identifying the similarities 
comprises: 

determining a correlation value indicating a degree of correlation between the
first and the second column; and 

determining whether the correlation value exceeds a predetermined threshold. 
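
By way of illustration only, and not as part of the claim language, the thresholding step recited above can be sketched as follows. The function names, the attribute dictionaries, the matching rule (fraction of shared attributes with equal values), and the default threshold of 0.5 are all assumptions made for this sketch; the claims do not specify how the correlation value is calculated.

```python
def correlation_value(attrs_a, attrs_b):
    """Hypothetical score: fraction of shared attribute keys whose values match."""
    shared = set(attrs_a) & set(attrs_b)
    if not shared:
        return 0.0
    matches = sum(1 for key in shared if attrs_a[key] == attrs_b[key])
    return matches / len(shared)


def is_correlated(attrs_a, attrs_b, threshold=0.5):
    """True when the correlation value exceeds the predetermined threshold."""
    return correlation_value(attrs_a, attrs_b) > threshold


# Hypothetical correlation attributes for two columns.
price = {"data_type": "decimal", "length": 10, "unit": "USD"}
cost = {"data_type": "decimal", "length": 10, "unit": "EUR"}
value = correlation_value(price, cost)  # two of three shared attributes match
```

Under these assumptions, two of the three shared attributes agree, so the pair is treated as correlated at a threshold of 0.5 but not at a threshold of 0.9.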

3. (Original) The method of claim 1, further comprising:

if it is determined that the first and second column are correlated, displaying an 
indication to a user that the first and second column can be merged; and 

in response to user input, merging the first and second column into a single 
column. 

4. (Original) The method of claim 1, wherein the first column is a column of a
first database table and the second column is a column of a second database table, the 
method further comprising: 

determining correlation attributes for N columns from the first database table and 
M columns from the second database table, where N and M are integers;

comparing the correlation attributes from each of the N columns with the 
correlation attributes from each of the M columns to identify similarities between the N 
and M columns; and 

on the basis of the identified similarities, determining whether one or more of the 
N and M columns are correlated. 

5. (Original) The method of claim 4, further comprising merging each of the one 
or more of the N and M columns determined to be correlated. 

6. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, metadata describing 
characteristics of each column; and 

wherein the correlation attributes are determined on the basis of the determined
metadata. 

7. (Original) The method of claim 6, wherein the determined metadata describes 
for each column an attribute of a data value in the column. 

8. (Original) The method of claim 6, wherein the determined metadata describes 
for each column at least one of: 

(i) a label; 

(ii) a comment; 

(iii) a constraint; 

(iv) a trigger; 

(v) a name; 

(vi) a data type; and

(vii) a column length. 

9. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, statistical parameters 
associated with each of the columns; and 

wherein the correlation attributes are determined on the basis of the determined 
statistical parameters. 

10. (Original) The method of claim 9, wherein the determined statistical 
parameters describe for each column at least one of: 

(i) a minimum value; 

(ii) a maximum value; 

(iii) an average value; and 

(iv) a range of values. 
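
By way of illustration only, and not as part of the claim language, the four statistical parameters listed above could be gathered for a numeric column as in the following sketch. The function name `column_statistics` and the dictionary keys are assumptions made for this example.

```python
def column_statistics(values):
    """Return the minimum, maximum, average, and range of a numeric column."""
    if not values:
        raise ValueError("column has no values")
    lowest, highest = min(values), max(values)
    return {
        "min": lowest,
        "max": highest,
        "average": sum(values) / len(values),
        "range": highest - lowest,
    }


# Hypothetical column content.
stats = column_statistics([2, 4, 9])
```

Two columns whose parameter sets are close to one another would then be candidates for the comparison step of claim 1.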

11. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, ontological properties 
describing cognitive qualities associated with each column; and 

wherein the correlation attributes are determined on the basis of the determined 
ontological properties. 

12. (Original) The method of claim 11, wherein the determined ontological 
properties describe for each column at least one of: 

(i) a synonym; 

(ii) a parent node; and 

(iii) an ancestor node. 

13. (Original) The method of claim 11, further comprising:

determining, from the one or more database tables, metadata describing the
ontological properties. 

14. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, measurement units 
associated with each column; and 

wherein the correlation attributes are determined on the basis of the determined
measurement units. 

15. (Original) The method of claim 14, further comprising:

determining, from the one or more database tables, metadata describing the 
measurement units. 

16. (Original) The method of claim 14, wherein identifying the similarities 
comprises: 

determining whether the first and second column are associated with similar 
measurement units. 

17. (Original) A computer-implemented method for identifying correlated columns
from database tables, comprising: 

determining metadata for at least two columns from one or more database
tables, the metadata describing characteristics of each column; 

analyzing content from the at least two columns from the one or more database 
tables; and 

determining a degree of correlation between the at least two columns using the 
determined metadata and the analyzed content. 

18. (Original) The method of claim 17, wherein determining the degree of
correlation comprises: 

assigning a first correlation value to the determined metadata;

assigning a second correlation value to the analyzed content, wherein the first
and second correlation values are different; and 

calculating a total correlation value on the basis of the first and second 
correlation values. 
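
By way of illustration only, and not as part of the claim language, one possible reading of the calculation above treats the first and second correlation values as unequal weights applied to a metadata similarity score and a content similarity score. The function name, the similarity scores, and the weights 0.4 and 0.6 are assumptions made for this sketch; the claim does not state how the total correlation value is calculated.

```python
def total_correlation(metadata_similarity, content_similarity,
                      metadata_value=0.4, content_value=0.6):
    """Combine two similarity scores using different (hypothetical) correlation values."""
    if metadata_value == content_value:
        raise ValueError("the first and second correlation values must differ")
    return metadata_value * metadata_similarity + content_value * content_similarity


# Hypothetical scores: metadata matches fully, content matches halfway.
total = total_correlation(1.0, 0.5)
```

The resulting total could then be compared against a predetermined threshold value to decide whether the columns are merged.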

19. (Original) The method of claim 18, further comprising:

merging the at least two columns if the total correlation value exceeds a 
predetermined threshold value. 

20. (Original) The method of claim 17, wherein analyzing the content comprises
determining statistical parameters from the content of each column.

21. (Original) The method of claim 17, further comprising:

merging the first and the at least one second column if it is determined that the 
first and at least one second column are correlated. 

22. (Currently Amended) A computer readable storage medium containing a 
program which, when executed, performs a process for identifying correlated columns 
from database tables, the process comprising: 

determining correlation attributes for a first column and a second column from 
one or more database tables, the correlation attributes describing for each column at 
least one of the column and content of the column; 

comparing the correlation attributes from the first and second column; 

identifying similarities between the first and second column on the basis of the 
comparison; 

on the basis of the identified similarities, determining whether the first and second 
column are correlated; and 

merging the first and second columns only if the columns are determined to be 
correlated.

23. (Original) The computer readable medium of claim 22, wherein identifying the
similarities comprises: 

determining a correlation value indicating a degree of correlation between the 
first and the second column; and 

determining whether the correlation value exceeds a predetermined threshold. 

24. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

if it is determined that the first and second column are correlated, displaying an 
indication to a user that the first and second column can be merged; and 

in response to user input, merging the first and second column into a single 
column. 

25. (Original) The computer readable medium of claim 22, wherein the first 
column is a column of a first database table and the second column is a column of a 
second database table, the process further comprising: 

determining correlation attributes for N columns from the first database table and 
M columns from the second database table, where N and M are integers; 

comparing the correlation attributes from each of the N columns with the 
correlation attributes from each of the M columns to identify similarities between the N 
and M columns; and 

on the basis of the identified similarities, determining whether one or more of the 
N and M columns are correlated. 

26. (Original) The computer readable medium of claim 25, wherein the process 
further comprises: 

merging each of the one or more of the N and M columns determined to be 
correlated. 

27. (Original) The computer readable medium of claim 22, wherein the process
further comprises: 

determining, from the one or more database tables, metadata describing 
characteristics of each column; and 

wherein the correlation attributes are determined on the basis of the determined 
metadata. 

28. (Original) The computer readable medium of claim 27, wherein the
determined metadata describes for each column an attribute of a data value in the 
column. 

29. (Original) The computer readable medium of claim 27, wherein the 
determined metadata describes for each column at least one of: 

(i) a label; 

(ii) a comment;

(iii) a constraint;

(iv) a trigger;

(v) a name; 

(vi) a data type; and 

(vii) a column length. 

30. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

determining, from the one or more database tables, statistical parameters 
associated with each of the columns; and 

wherein the correlation attributes are determined on the basis of the determined 
statistical parameters. 

31. (Original) The computer readable medium of claim 30, wherein the 
determined statistical parameters describe for each column at least one of: 

(i) a minimum value;

(ii) a maximum value;

(iii) an average value; and

(iv) a range of values.

PATENT

App. Ser. No.: 10/829,624
Atty. Dkt. No.: ROC920030407US1
PS Ref. No.: IBMK30407

32. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

determining, from the one or more database tables, ontological properties 
describing cognitive qualities associated with each column; and 

wherein the correlation attributes are determined on the basis of the determined
ontological properties. 

33. (Original) The computer readable medium of claim 32, wherein the 
determined ontological properties describe for each column at least one of: 

(i) a synonym;

(ii) a parent node; and

(iii) an ancestor node.

34. (Original) The computer readable medium of claim 32, wherein the process 
further comprises: 

determining, from the one or more database tables, metadata describing the 
ontological properties. 

35. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

determining, from the one or more database tables, measurement units
associated with each column; and

wherein the correlation attributes are determined on the basis of the determined
measurement units. 

36. (Original) The computer readable medium of claim 35, wherein the process
further comprises: 

determining, from the one or more database tables, metadata describing the 
measurement units. 

37. (Original) The computer readable medium of claim 35, wherein identifying the 
similarities comprises: 

determining whether the first and second column are associated with similar
measurement units. 

38. (Currently Amended) A computer readable storage medium containing a 
program which, when executed, performs a process for identifying correlated columns 
from database tables, the process comprising: 

determining metadata for at least two columns from one or more database 
tables, the metadata describing characteristics of each column; 

analyzing content from the at least two columns from the one or more database 
tables; and 

computing a degree of correlation between the at least two columns
using the determined metadata and the analyzed content. 

39. (Original) The computer readable medium of claim 38, wherein determining 
the degree of correlation comprises: 

assigning a first correlation value to the determined metadata; 

assigning a second correlation value to the analyzed content, wherein the first 
and second correlation values are different; and 

calculating a total correlation value on the basis of the first and second 
correlation values. 

40. (Original) The computer readable medium of claim 39, wherein the process 
further comprises: 

merging the at least two columns if the total correlation value exceeds a
predetermined threshold value. 

41. (Original) The computer readable medium of claim 38, wherein analyzing the
content comprises: 

determining statistical parameters from the content of each column.

42. (Original) The computer readable medium of claim 38, wherein the process 
further comprises: 

merging the first and the at least one second column if it is determined that the 
first and at least one second column are correlated. 

43. (Original) A data processing system comprising: 

at least one database having one or more database tables; and 
a correlation manager for identifying correlated columns from the one or more 
database tables, the correlation manager being configured for: 

determining correlation attributes for a first column and a second column 
from the one or more database tables, the correlation attributes describing for 
each column at least one of the column and content of the column; 

comparing the correlation attributes from the first and second column; 

identifying similarities between the first and second column on the basis of 
the comparison; 

on the basis of the identified similarities, determining whether the first and 
second column are correlated; and

merging the first and second columns only if the columns are determined 
to be correlated. 

44. (Original) A data processing system comprising: 

at least one database having one or more database tables; and 
a correlation manager for identifying correlated columns from the one or more
database tables, the correlation manager being configured for: 

determining metadata for at least two columns from the one or more 
database tables, the metadata describing characteristics of each column; 

analyzing content from the at least two columns from the one or more 
database tables; and 

determining a degree of correlation between the at least two columns 
using the determined metadata and the analyzed content. 